Substructure counting graph kernels for machine learning from RDF data

Authors

Gerben de Vries

Steven de Rooij

Abstract

In this paper we introduce a framework for learning from RDF data using graph kernels that count substructures in RDF graphs. The framework systematically covers most of the previously defined kernels and provides a number of new variants. Our definitions include fast kernel variants that are computed directly on the RDF graph. To improve the performance of these kernels we detail two strategies. The first strategy involves ignoring the vertex labels that have a low frequency among the instances. Our second strategy is to remove hubs to simplify the RDF graphs. We test our kernels in a number of classification experiments with real-world RDF datasets. Overall, the kernels that count subtrees show the best performance. However, they are closely followed by simple bag-of-labels baseline kernels. The direct kernels substantially decrease computation time, while keeping performance the same. For the walk-counting kernel, the decrease in computation time of the approximation is so large that it becomes a computationally viable kernel to use. Ignoring low-frequency labels improves the performance for all datasets. The hub removal algorithm increases performance on two out of three of our smaller datasets, but has little impact when used on our larger datasets.
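As an illustration of two ideas named in the abstract, the bag-of-labels baseline and the strategy of ignoring low-frequency labels, the following is a minimal sketch and not the authors' implementation. The toy triples, the function names, the fixed neighbourhood depth, and the pruning rule (keep a label only if it occurs in at least `min_instances` instance neighbourhoods) are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical toy RDF graph as (subject, predicate, object) triples.
TRIPLES = [
    ("ex:alice", "ex:worksAt", "ex:lab1"),
    ("ex:bob", "ex:worksAt", "ex:lab1"),
    ("ex:carol", "ex:worksAt", "ex:lab2"),
    ("ex:lab1", "ex:partOf", "ex:uni"),
    ("ex:lab2", "ex:partOf", "ex:uni"),
    ("ex:alice", "ex:likes", "ex:tea"),
]


def neighbourhood_labels(triples, instance, depth):
    """Count predicate and object labels reached from `instance` in `depth` hops."""
    labels = Counter()
    frontier = {instance}
    for _ in range(depth):
        next_frontier = set()
        for s, p, o in triples:
            if s in frontier:
                labels[p] += 1
                labels[o] += 1
                next_frontier.add(o)
        frontier = next_frontier
    return labels


def prune_low_frequency(bags, min_instances):
    """Drop labels that occur in fewer than `min_instances` bags (assumed rule)."""
    # Iterating a Counter yields each distinct label once, so this counts bags.
    instance_freq = Counter(label for bag in bags for label in bag)
    keep = {label for label, n in instance_freq.items() if n >= min_instances}
    return [Counter({l: c for l, c in bag.items() if l in keep}) for bag in bags]


def bag_of_labels_kernel(bag_a, bag_b):
    """Linear kernel: dot product of two label-count vectors."""
    return sum(count * bag_b[label] for label, count in bag_a.items())


instances = ["ex:alice", "ex:bob", "ex:carol"]
bags = [neighbourhood_labels(TRIPLES, i, depth=2) for i in instances]
pruned = prune_low_frequency(bags, min_instances=2)

# Kernel (Gram) matrix over the three instances, usable with any kernel method.
gram = [[bag_of_labels_kernel(a, b) for b in pruned] for a in pruned]
```

In this toy graph the labels `ex:likes`, `ex:tea`, and `ex:lab2` each occur in only one instance neighbourhood, so they are dropped before the kernel is computed. The resulting matrix could be passed to a kernel classifier that accepts precomputed kernels.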

Similar resources

Graph Kernels for RDF Data

The increasing availability of structured data in Resource Description Framework (RDF) format poses new challenges and opportunities for data mining. Existing approaches to mining RDF have only focused on one specific data representation, one specific machine learning algorithm or one specific task. Kernels, however, promise a more flexible approach by providing a powerful framework for decoupl...

Predicting Quality of Crowdsourced Annotations Using Graph Kernels

Annotations obtained by Cultural Heritage institutions from the crowd need to be automatically assessed for their quality. Machine learning using graph kernels is an effective technique to use structural information in datasets to make predictions. We employ the Weisfeiler-Lehman graph kernel for RDF to make predictions about the quality of crowdsourced annotations in the Steve.museum dataset, which...

RDF2Vec: RDF Graph Embeddings for Data Mining

Linked Open Data has been recognized as a valuable source for background information in data mining. However, most data mining tools require features in propositional form, i.e., a vector of nominal or numerical features associated with an instance, while Linked Open Data sources are graphs by nature. In this paper, we present RDF2Vec, an approach that uses language modeling approaches for unsu...

RDF2Vec: RDF Graph Embeddings and Their Applications

Linked Open Data has been recognized as a valuable source for background information in many data mining and information retrieval tasks. However, most of the existing tools require features in propositional form, i.e., a vector of nominal or numerical features associated with an instance, while Linked Open Data sources are graphs by nature. In this paper, we present RDF2Vec, an approach that u...

A Fast Approximation of the Weisfeiler-Lehman Graph Kernel for RDF Data

We introduce an approximation of the Weisfeiler-Lehman graph kernel algorithm aimed at improving the computation time of the kernel when applied to Resource Description Framework (RDF) data. RDF is the representation/storage format of the semantic web and it essentially represents a graph. One direction for learning from the semantic web is using graph kernel methods on RDF. This is a very gen...

Journal:

J. Web Sem.

Volume 35

Publication date: 2015
